This report was generated using iSHARC. For additional
downstream analyses and relevant tools, please refer to COBE.
QC metrics and filtering
Selected percentiles for QC metrics
- nCount_RNA: The number of transcripts detected per cell.
- pct_MT: The percentage of reads originating from the mitochondrial
genes.
- nCount_ATAC: The number of unique nuclear fragments.
- TSS_Enrichment: The ratio of fragments centered at TSS to those in
TSS-flanking regions.
- Nucleosome_Signal: The approximate ratio of mononucleosomal to
nucleosome-free fragments.
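As a rough illustration of two of the definitions above, the Python sketch below computes pct_MT from per-gene counts and a nucleosome signal from fragment lengths. It is not the pipeline's own code. The `MT-` gene-name prefix and the 147 bp / 294 bp length boundaries are assumptions taken from common human-genome and Signac conventions, and the function names are hypothetical.

```python
def pct_mt(gene_counts, mt_prefix="MT-"):
    """Percentage of a cell's RNA counts that come from mitochondrial genes.

    gene_counts maps gene name -> count for one cell.
    mt_prefix is an assumption (human gene symbols start with "MT-").
    """
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0
    mt = sum(c for g, c in gene_counts.items() if g.startswith(mt_prefix))
    return 100.0 * mt / total


def nucleosome_signal(fragment_lengths, free_max=147, mono_max=294):
    """Approximate ratio of mononucleosomal to nucleosome-free fragments.

    Length boundaries are assumptions: fragments shorter than free_max are
    treated as nucleosome-free, and those from free_max to mono_max
    (inclusive) as mononucleosomal.
    """
    free = sum(1 for n in fragment_lengths if n < free_max)
    mono = sum(1 for n in fragment_lengths if free_max <= n <= mono_max)
    return mono / free if free else float("nan")
```

For example, a cell with 5 counts in `MT-CO1` and 95 in `ACTB` has a pct_MT of 5.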
Number of jointly called cells by Cell Ranger ARC: 5118
| Percentile | nCount_RNA | pct_MT | nCount_ATAC | TSS_Enrichment | Nucleosome_Signal |
|---|---|---|---|---|---|
| 0% | 40.00 | 0.0000000 | 0.00 | 0.0000000 | 0.2500000 |
| 2.5% | 150.00 | 0.0000000 | 8.00 | 0.3868724 | 0.4705134 |
| 25% | 407.00 | 0.2777778 | 29.00 | 2.0379620 | 0.6453259 |
| 50% | 605.00 | 0.5687209 | 60.00 | 2.9970030 | 0.7291028 |
| 75% | 983.75 | 1.1075949 | 119.00 | 4.1958042 | 0.8141078 |
| 97.5% | 6377.80 | 3.6036036 | 488.15 | 10.3946054 | 1.0284713 |
| 100% | 38992.00 | 14.6666667 | 2627.00 | 29.5704296 | 3.0000000 |
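A table of this shape can be produced with a few lines of code. The Python sketch below uses linear interpolation between order statistics, which is the rule behind R's default `quantile()` (type 7). Whether the report uses exactly this rule is an assumption, and the function names are hypothetical.

```python
import math

# Probabilities matching the rows of the percentile table.
PROBS = (0.0, 0.025, 0.25, 0.5, 0.75, 0.975, 1.0)


def percentile(values, p):
    """Percentile by linear interpolation between neighboring sorted values."""
    xs = sorted(values)
    if not xs:
        raise ValueError("values must not be empty")
    h = (len(xs) - 1) * p          # fractional index into the sorted data
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)  # clamp at the last element when p == 1
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


def qc_percentiles(values, probs=PROBS):
    """Return {"0%": ..., "2.5%": ..., ...} for one QC metric."""
    return {f"{100 * p:g}%": percentile(values, p) for p in probs}
```

Calling `qc_percentiles` once per metric column gives one table column each.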
Violin plots for selected QC metrics

Second-round QC filtering assessment
The automatic QC metric cutoffs are determined based on the suggested
thresholds and the corresponding median ± 3*MAD (Median Absolute
Deviation), as outlined below. The subsequent results for the cells in
this report depend on whether the second-round QC filtering is enabled
in the configuration file.
- nCount_RNA: [median - 3*MAD, median + 3*MAD]
- nCount_ATAC: [median - 3*MAD, median + 3*MAD]
- pct_MT: < min(20, median + 3*MAD)
- TSS_Enrichment: > max(1, median - 3*MAD)
- Nucleosome_Signal: < min(2, median + 3*MAD)
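The cutoff rules listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline's implementation: the MAD scaling constant of 1.4826 mirrors the default of R's `mad()` and may differ from what the pipeline uses, two-sided intervals are treated as inclusive, and the function names are hypothetical.

```python
from statistics import median


def mad(values, scale=1.4826):
    """Median absolute deviation; scale=1.4826 is an assumption (R's default)."""
    m = median(values)
    return scale * median(abs(v - m) for v in values)


def qc_cutoffs(metrics, scale=1.4826):
    """metrics maps metric name -> list of per-cell values."""
    def band(name):
        m, d = median(metrics[name]), mad(metrics[name], scale)
        return m - 3 * d, m + 3 * d

    return {
        "nCount_RNA": band("nCount_RNA"),                       # [low, high]
        "nCount_ATAC": band("nCount_ATAC"),                     # [low, high]
        "pct_MT": min(20, band("pct_MT")[1]),                   # upper bound
        "TSS_Enrichment": max(1, band("TSS_Enrichment")[0]),    # lower bound
        "Nucleosome_Signal": min(2, band("Nucleosome_Signal")[1]),  # upper bound
    }


def passes(cell, cut):
    """True if one cell (dict of metric -> value) satisfies every cutoff."""
    return (
        cut["nCount_RNA"][0] <= cell["nCount_RNA"] <= cut["nCount_RNA"][1]
        and cut["nCount_ATAC"][0] <= cell["nCount_ATAC"] <= cut["nCount_ATAC"][1]
        and cell["pct_MT"] < cut["pct_MT"]
        and cell["TSS_Enrichment"] > cut["TSS_Enrichment"]
        and cell["Nucleosome_Signal"] < cut["Nucleosome_Signal"]
    )
```

The `min` and `max` calls are what combine the suggested fixed thresholds (20, 1, 2) with the data-driven median ± 3*MAD bounds: whichever is stricter wins.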
| Number of cells | 5118 | 0 | 0 | 0 |
| Fraction of 1st filtered cells | 1 | 0 | 0 | 0 |
Cell cycle assessment
- Exercise caution when analyzing cells undergoing differentiation
processes (e.g., hematopoiesis).
- PCA plots are provided both before and after regressing out the
effects of the cell cycle. Only cell cycle-related genes are used for
the PCA analysis.

Integration and Clustering
ATAC, RNA and integrated WNN clustering

Modality weights for integrated WNN clusters
Higher ATAC weights indicate that the epigenome has a greater
influence on the corresponding cell clusters. Similarly, higher RNA
weights suggest that the transcriptome plays a more significant role in
the corresponding cell clusters.
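One way to read the weights per cluster is to average them over the cells in each cluster and compare the two modalities. The Python sketch below does this for per-cell weights supplied as plain lists. The input layout and the function name are hypothetical, and the report's own plots may summarize the weights differently.

```python
from collections import defaultdict


def summarize_modality_weights(clusters, rna_weights, atac_weights):
    """Mean RNA and ATAC weight per cluster, plus the modality with the larger mean.

    All three arguments are parallel per-cell sequences.
    """
    acc = defaultdict(lambda: [0.0, 0.0, 0])  # cluster -> [sum RNA, sum ATAC, n cells]
    for c, r, a in zip(clusters, rna_weights, atac_weights):
        acc[c][0] += r
        acc[c][1] += a
        acc[c][2] += 1

    summary = {}
    for c, (r_sum, a_sum, n) in acc.items():
        mean_rna, mean_atac = r_sum / n, a_sum / n
        if mean_rna > mean_atac:
            dominant = "RNA"
        elif mean_atac > mean_rna:
            dominant = "ATAC"
        else:
            dominant = "tie"
        summary[c] = {"RNA": mean_rna, "ATAC": mean_atac, "dominant": dominant}
    return summary
```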


Clustering changes across individual and integrated modalities
The interactive Sankey plot below enables the tracking of clustering
changes between individual and integrated modalities.
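The flows in such a Sankey plot are counts of cells that share a given pair of cluster labels. A minimal Python sketch of that tally is shown below. The label values and the function name are hypothetical, and the report's plotting code is not reproduced here.

```python
from collections import Counter


def cluster_flows(source_labels, target_labels):
    """Count cells moving from each source cluster to each target cluster.

    Both arguments are per-cell label sequences in the same cell order,
    e.g. RNA-only cluster labels and WNN cluster labels.
    """
    if len(source_labels) != len(target_labels):
        raise ValueError("label sequences must have the same length")
    return Counter(zip(source_labels, target_labels))
```

Each key `(source, target)` corresponds to one ribbon in the plot, and its count to the ribbon's width.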
Auto cell annotation
Annotation using publicly available references
The Blueprint/ENCODE reference from the R package celldex
is used for automatically annotating each cell cluster using SingleR.

Inferring tumor and normal cells
CopyKAT is
used to classify cells into the following categories: aneuploid
(tumor cells); diploid (normal stromal cells); not.defined (cells that
cannot be predicted by CopyKAT); not.predicted (cells excluded
from CopyKAT prediction).

Cluster-specific genes
Top 5 upregulated genes per WNN cluster
The ATAC peaks linked to these top 5 genes can be found in the file
Met_lung_top5_DEGs_linked_peaks.csv. This file will be empty if no genes
passed the preset cut-offs.
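To make the selection step concrete, the Python sketch below picks the top upregulated genes per cluster from a marker table. It assumes genes are filtered by adjusted p-value and ranked by log2 fold change. The column names follow Seurat's marker output (`cluster`, `gene`, `avg_log2FC`, `p_val_adj`), but the default cutoff values are placeholders, not the pipeline's preset cutoffs, and the function name is hypothetical.

```python
from collections import defaultdict


def top_upregulated(markers, n=5, max_p_adj=0.05, min_log2fc=0.0):
    """Return {cluster: [gene, ...]} with up to n genes per cluster.

    markers is a list of dicts, one per (cluster, gene) test result.
    max_p_adj and min_log2fc are placeholder cutoffs (assumptions).
    """
    by_cluster = defaultdict(list)
    for row in markers:
        if row["p_val_adj"] <= max_p_adj and row["avg_log2FC"] > min_log2fc:
            by_cluster[row["cluster"]].append(row)

    return {
        cluster: [
            r["gene"]
            for r in sorted(rows, key=lambda r: r["avg_log2FC"], reverse=True)[:n]
        ]
        for cluster, rows in by_cluster.items()
    }
```

A cluster with no gene passing the cutoffs is simply absent from the result, which matches the note that the output can be empty.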

Functional enrichment analysis for WNN cluster-specific genes

Cluster-specific regulatory regions
Top 5 enriched TF motifs per WNN cluster
NOTE: The heatmap is generated from the combined top-enriched TF motifs,
saved in Seurat_object@assays$ATAC@misc$DARs_motif_hm. Values of
-log10(p_adj) greater than 10 are capped at 10. The heatmap will be empty
if no motifs are enriched or if none pass the preset cutoffs.
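The capping of heatmap values can be written as a one-line transform. The Python sketch below is illustrative only. The function name is hypothetical, and mapping an adjusted p-value of exactly 0 to the cap is an assumption made here to avoid taking the logarithm of zero.

```python
import math


def capped_neg_log10(p_adj, cap=10.0):
    """Return -log10(p_adj), limited to at most `cap`."""
    if p_adj < 0:
        raise ValueError("p_adj must be non-negative")
    if p_adj == 0:
        return cap  # assumption: treat an underflowed p-value as maximally significant
    return min(-math.log10(p_adj), cap)
```

Capping keeps a few extremely small p-values from dominating the color scale.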
Gene Regulatory Network
Pando is used
for GRN analysis:
- Edge color: dark grey (inhibitory); orange (activating)
- Node color: light grey (DEG); brown (TF); brown with black circle
(DEG & TF)
- Node size: based on node centrality in the graph
- The plots will be empty if no GRNs are identified with the preset cutoffs.
GRN based on combined WNN cluster-specific genes
GRN based on individual WNN cluster-specific genes
List of main outputs
Seurat objects at three depths
- Met_lung_initial_seurat_object.RDS: The initial Seurat object
containing RNA and ATAC assays, with peaks re-called using MACS. No
additional analyses are performed.
- Met_lung_vertically_integrated_seurat_object.RDS: This Seurat object
includes the optional second-round QC, cell cycle correction,
normalization, and clustering for RNA and ATAC data. RNA and ATAC data
are integrated using WNN.
- Met_lung_extended_seurat_object.RDS: The extended Seurat object
contains the results for additional assays based on the WNN integrated
clusters, including intermediate and final results for all assays,
particularly:
- Seurat_object@assays$ATAC: The ATAC assay built from peaks called
with MACS.
- Seurat_object@assays$SCT@misc$DEGs: The WNN cluster-specific
differentially expressed genes (DEGs)
- Seurat_object@assays$ATAC@misc$DARs: The WNN cluster-specific
differentially accessible regions (DARs)
- Seurat_object@assays$ATAC@misc$DARs_motif: The motif enrichment
results for top cluster-specific DARs (p_adj < 0.005)
- Seurat_object@assays$ATAC@links: The peaks linked to the top 5 WNN
cluster-specific DEGs
- scMultiome@misc$Combined_DEGs_GRN: tbl_graph object for combined
DEGs
- scMultiome@misc$Cluster_DEGs_GRN: tbl_graph objects for WNN
cluster-specific DEGs
Tables
- Met_lung_extended_metaData.csv: A metadata table that includes QC
metrics, cell cycle phases, clustering results for RNA, ATAC, and WNN
integration, SingleR annotations, and CopyKAT tumor cell predictions for
each cell.
- Met_lung*DEGs_GRN_Modules.csv: CSV files for the DEGs-based GRN
modules.
RMD file
- Met_lung_QC_and_Primary_Results.Rmd: The Rmd file for generating
this HTML report.
Additional tables and figures are available in the directory of
individual samples: workdir/individual_samples/Met_lung